1 Using Lookahead to reduce memory bank contention for decoupled operand
references

Peter L. Bird, Richard A. Uhlig 

August 1991 Proceedings of the 1991 ACM/IEEE conference on Supercomputing 

Publisher: ACM Press 

Full text available: pdf(1.09 MB) Additional Information: full citation , references , citings , index terms



2 Preprocessors in a data communication computer environment 
David L Mills 

October 1969 Proceedings of the first ACM symposium on Problems in the 
optimization of data communications systems 

Publisher: ACM Press 

Full text available: pdf(1.26 MB) Additional Information: full citation , abstract , references , index terms 

Realizing the need for a highly adaptable transmission control unit to interface varied 
terminal equipment to the Michigan Timesharing System (MTS), the University of Michigan 
initiated in 1965 the development of a special control unit to be used in conjunction with 
the System/360 Model 67. Called the Data Concentrator The design approach taken in the 
Data Concentrator has been to nucleate about a small general-purpose computer a 
number of special-purpose interfaces to the variou ... 

3 A novel cache design for vector processing
Qing Yang, Liping Wu Yang 

April 1992 ACM SIGARCH Computer Architecture News , Proceedings of the 19th
annual international symposium on Computer architecture, Volume 20 issue 2
Publisher: ACM Press

Full text available: pdf(1.35 MB) Additional Information: full citation , abstract , references , citings , index terms

This paper introduces an innovative cache design for vector computers, called prime- 
mapped cache. By utilizing the special properties of a Mersenne prime, the new design 
does not increase the critical path length of a processor, nor does it increase the cache 
access time as compared to a direct-mapped cache. The prime-mapped cache minimizes 
cache miss ratio caused by line interferences that have been shown to be critical for 
numerical applications by previous investigators. We show that sig ... 
4 A 50-Gb/s IP router

Craig Partridge, Philip P. Carvey, Ed Burgess, Isidro Castineyra, Tom Clarke, Lise Graham,
Michael Hathaway, Phil Herman, Allen King, Steve Kohalmi, Tracy Ma, John Mcallen, Trevor 
Mendez, Walter C. Milliken, Ronald Pettyjohn, John Rokosz, Joshua Seeger, Michael Sollins, 
Steve Storch, Benjamin Tober, Gregory D. Troxel 

June 1998 IEEE/ACM Transactions on Networking (TON), Volume 6 issue 3 
Publisher: IEEE Press 

Full text available: pdf(133.28 KB) Additional Information: full citation , references , citings , index terms , review



Keywords: data communications, internetworking, packet switching, routing 

5 Contents of the Computer Communication Review 1970-1994 
David Oran 

January 1995 ACM SIGCOMM Computer Communication Review, Volume 25 issue 1
Publisher: ACM Press 

Full text available: pdf(1.75 MB) Additional Information: full citation , index terms




6 Exploring the benefits of multiple hardware contexts in a multiprocessor architecture:
preliminary results
W.-D. Weber, A. Gupta 

April 1989 ACM SIGARCH Computer Architecture News , Proceedings of the 16th
annual international symposium on Computer architecture, Volume 17 issue 3
Publisher: ACM Press

Full text available: pdf(965.98 KB) Additional Information: full citation , abstract , references , citings , index terms

A fundamental problem that any scalable multiprocessor must address is the ability to 
tolerate high latency memory operations. This paper explores the extent to which multiple 
hardware contexts per processor can help to mitigate the negative effects of high latency. 
In particular, we evaluate the performance of a directory-based cache coherent 
multiprocessor using memory reference traces obtained from three parallel applications. 
We explore the case where there are a small fixed number (2-4 ... 

7 Multiprocessor Organization - a Survey
Philip Enslow 

January 1977 ACM Computing Surveys (CSUR), Volume 9 issue 1
Publisher: ACM Press 

Full text available: pdf(1.79 MB) Additional Information: full citation , references , citings , index terms





8 Special session on reconfigurable computing: Reconfigurable platforms for ubiquitous
computing

Manfred Glesner, Thomas Hollstein, Leandro Soares Indrusiak, Peter Zipf, Thilo Pionteck,
Mihail Petrov, Heiko Zimmer, Tudor Murgan 

April 2004 Proceedings of the 1st conference on Computing frontiers 
Publisher: ACM Press 

Full text available: pdf(479.97 KB) Additional Information: full citation , abstract , references , index terms

Ubiquitous computing requires flexibility. Melting distributed electronic devices into
everyday's life implies the need to adapt to evolving standards and dynamic environments.
Furthermore, to gain user acceptance, such devices should be able to adapt to different
usage patterns and user profiles. Scalability is also an important issue, allowing functional
enhancements to already deployed systems. In this work we address these issues applying
the concept of reconfigurability on different abstract ...

Keywords: communication, dynamic power management, networks-on-chip,
reconfigurable hardware, reconfigurable processors, reconfiguration, ubiquitous computing

http://portal.acm.org/results.cfm?CFID=56142969&CFTOKEN=9142133 10/21/05

Results (page 1): +bank +distribution, +multiplexor, +switch, +data +stream

Page 3 of 6



9 A processor for a high-performance personal computer
Butler W. Lampson, Kenneth A. Pier 

May 1980 Proceedings of the 7th annual symposium on Computer Architecture 
Publisher: ACM Press 

Full text available: pdf(1.24 MB) Additional Information: full citation , abstract , references , citings , index terms

This paper describes the design goals, micro-architecture, and implementation of the
microprogrammed processor for a compact high performance personal computer. This 
computer supports a range of high level language environments and high bandwidth I/O 
devices. Besides the processor, it has a cache, a memory map, main storage, and an 
instruction fetch unit; these are described in other papers. The processor can be shared 
among 16 microcoded tasks, performing microcode context switches ... 

10 The VMP network adapter board (NAB): high-performance network communication 
for multiprocessors 
H. Kanakia, D. Cheriton 

August 1988 ACM SIGCOMM Computer Communication Review , Symposium
proceedings on Communications architectures and protocols, Volume 18 issue 4
Publisher: ACM Press

Full text available: pdf(1.63 MB) Additional Information: full citation , abstract , references , citings , index terms

High performance computer communication between multiprocessor nodes requires 
significant improvements over conventional host-to-network adapters. Current host-to- 
network adapter interfaces impose excessive processing, system bus and interrupt 
overhead on a multiprocessor host. Current network adapters are either limited in 
function, wasting key host resources such as the system bus and the processors, or else 
intelligent but too slow, because of complex transport protocols and because of a ... 

11 A processor for a high-performance personal computer 
Butler W. Lampson, Kenneth A. Pier 

August 1998 25 years of the international symposia on Computer architecture 
(selected papers) 

Publisher: ACM Press 

Full text available: pdf(1.57 MB) Additional Information: full citation , references , index terms



12 Spatial computation 

Mihai Budiu, Girish Venkataramani, Tiberiu Chelcea, Seth Copen Goldstein 
October 2004 Proceedings of the 11th international conference on Architectural
support for programming languages and operating systems, Volume 32 , 39 , 38 Issue 5 , 11 , 5
Publisher: ACM Press



Full text available: pdf(573.00 KB) Additional Information: full citation , abstract , references , index terms

This paper describes a computer architecture, Spatial Computation (SC), which is based on 
the translation of high-level language programs directly into hardware structures. SC 
program implementations are completely distributed, with no centralized control. SC 
circuits are optimized for wires at the expense of computation units. In this paper we 
investigate a particular implementation of SC: ASH (Application-Specific Hardware). Under 
the assumption that computation is cheaper than co ... 

Keywords: application-specific hardware, dataflow machine, low-power, spatial 
computation 



13 The Starfire SMP interconnect 

Alan Charlesworth, Nicholas Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, 
Ricki Williams, Andrew Phelps 

November 1997 Proceedings of the 1997 ACM/IEEE conference on Supercomputing 
(CDROM) 

Publisher: ACM Press 

Full text available: pdf(273.52 KB) Additional Information: full citation , abstract , references , citings

The Starfire interconnect extends the envelope of Unix symmetric multiprocessor (SMP) 
systems in several dimensions. Interconnect: an active centerplane with four address 
routers and a 16x16 data crossbar provides 64 UltraSPARC processors with uniform 
memory access at a bandwidth of 10,667 MBps. Flexibility: Starfire can be dynamically 
reconfigured into multiple hardware-protected operating system domains. Robustness: 
Failing boards can be hot swapped without interrupting sy ... 

Keywords: SMP, UMA, bandwidth, domains, interconnect, latency, partitions 



14 Architecture and implementation of a VLIW supercomputer 

Robert P. Colwell, W. Eric Hall, Chandra S. Joshi, David B. Papworth, Paul K. Rodman, James 
E. Tornes 

November 1990 Proceedings of the 1990 ACM/IEEE conference on Supercomputing 
Publisher: IEEE Computer Society 

Full text available: pdf(1.29 MB) Additional Information: full citation , abstract , references

Very-Long-Instruction-Word (VLIW) computers achieve high performance by exploiting the 
fine-grain parallelism present in sequential or vectorizable code. Multiflow's /200 and /300 
VLIW systems yielded near-supercomputer performance by this means despite the 
relatively slow (65 nS) clocks. With its much faster clock period (15 nS) and architectural 
improvements, the new /500 system attains approximately 4-9X the performance of its 
predecessors. This paper describes the /500 architecture and implem ...

15 The datacycle architecture for very high throughput database systems 
Gary Herman, K. C. Lee, Abel Weinrib 

December 1987 ACM SIGMOD Record , Proceedings of the 1987 ACM SIGMOD
international conference on Management of data, Volume 16 issue 3
Publisher: ACM Press

Full text available: pdf(1.00 MB) Additional Information: full citation , abstract , references , citings , index terms

The evolutionary trend toward a database-driven public communications network has 
motivated research into database architectures capable of executing thousands of 
transactions per second. In this paper we introduce the Datacycle architecture, an attempt 
to exploit the enormous transmission bandwidth of optical systems to permit the 
implementation of high throughput multiprocessor database systems. The architecture has 
the potential for unlimited query throughput, simplified data man ...

16 Conjoined-Core Chip Multiprocessing 
Rakesh Kumar, Norman P. Jouppi, Dean M. Tullsen 

December 2004 Proceedings of the 37th annual International Symposium on
Microarchitecture

Publisher: IEEE Computer Society 

Full text available: pdf(369.99 KB) Additional Information: full citation , abstract

Chip Multiprocessors (CMP) and Simultaneous Multi-threading (SMT) are two approaches 
that have been proposed to increase processor efficiency. We believe these two 
approaches are two extremes of a viable spectrum. Between these two extremes, there 
exists a range of possible architectures, sharing varying degrees of hardware between 
processors or threads. This paper proposes conjoined-core chip multiprocessing - 
topologically feasible resource sharing between adjacent cores of a chip multiprocess ... 

17 Low power scalable encryption for wireless systems 
James Goodman, Anantha P. Chandrakasan 

January 1998 Wireless Networks, Volume 4 issue 1
Publisher: Kluwer Academic Publishers 

Full text available: pdf(7.39 MB) Additional Information: full citation , abstract , references , index terms

Secure transmission of multimedia information (e.g., voice, video, data, etc.) is critical in 
many wireless network applications. Wireless transmission imposes constraints not found 
in typical wired systems such as low power consumption, tolerance to high bit error rates, 
and scalability. A variety of low power techniques have been developed to reduce the 
power of several encryption algorithms. One key idea involves exploiting the variation in 
computation requirements to dynamically vary th ... 



18 Self-assessment procedure XII: a self-assessment procedure dealing with computer
architecture

Robert I. Winner, Edward M. Carter 

January 1984 Communications of the ACM, Volume 27 issue 1
Publisher: ACM Press 

Full text available: pdf(589.25 KB) Additional Information: full citation , references , index terms




19 Hybrid volume and polygon rendering with cube hardware 
Kevin Kreeger, Arie Kaufman 

July 1999 Proceedings of the ACM SIGGRAPH/ EUROGRAPHICS workshop on 
Graphics hardware 

Publisher: ACM Press 

Full text available: pdf(1.85 MB) Additional Information: full citation , references , citings , index terms




Keywords: cube architecture, mixing polygons and volumes, ray casting, run-length- 
encoding, volume rendering 



20 Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads
and Scaling

Rakesh Kumar, Victor Zyuban, Dean M. Tullsen 

June 2005 Proceedings of the 32nd Annual International Symposium on Computer 
Architecture ISCA '05 
Publisher: IEEE Computer Society

Full text available: pdf(235.90 KB) Additional Information: full citation , abstract

This paper examines the area, power, performance, and design issues for the on-chip 
interconnects on a chip multiprocessor, attempting to present a comprehensive view of a 
class of interconnect architectures. It shows that the design choices for the interconnect 
have significant effect on the rest of the chip, potentially consuming a significant fraction 
of the real estate and power budget. This research shows that designs that treat 
interconnect as an entity that can be independently architecte ... 